Scripting: Python and R
You can create scripts in Python or R by adding the relevant node to your data flow. The script's output can be added as columns to the existing table, or written to a new table.
The R and Python nodes can only be connected to Select, Query, or Multi Select nodes. With the R node you can create or reference a "regular script"; the Python node can also be used to create learn and predict scripts.
There are three ways to configure your script in the Properties panel:
- Write a script in the Script window
- Click the shopping cart icon to download a script from the Marketplace
- Click the folder icon to select an existing script that's been saved to the CMS
Packages
R
When writing R scripts, you must specify the packages that should be downloaded.
Python
Python packages are managed from the Admin console under Scripting Environments. Select the required environment from the relevant drop-down in the Properties panel, then click the Packages button to see which packages have been downloaded to that environment.
The Regular Script function lets you inject scripts into a table during the ETL. The results are added to the selected table as an additional column, or you can use your script to create a new table.
You can write your own Python or R script, or download a script from the Marketplace.
In the screenshot below, the data frame is made up of the Age, Daily Rate, and Distance from Home columns. A DBSCAN clustering algorithm has been applied, and the resulting column (clustering number in this example) can later be analyzed in Discover.
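As a rough illustration of the kind of script described above, here is a minimal, dependency-free DBSCAN sketch in Python. It is not the script from the screenshot: the variable names (`age`, `daily_rate`, `distance_from_home`, `clustering_number`), the sample values, the `eps` and `min_samples` settings, and the assumption that each mapped column arrives as a plain Python list are all hypothetical.

```python
from math import dist

# Hypothetical stand-ins for the three mapped input columns.
age = [25, 26, 27, 50, 51, 52, 80]
daily_rate = [100, 102, 101, 500, 505, 498, 1200]
distance_from_home = [1, 2, 1, 20, 21, 19, 5]

def dbscan(points, eps, min_samples):
    """Label each point with a cluster number (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point joins the cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nearby = neighbors(j)
            if len(nearby) >= min_samples:
                queue.extend(nearby)  # j is a core point, so keep expanding
    return labels

# One row per record, one coordinate per input column.
points = list(zip(age, daily_rate, distance_from_home))

# The output column: one cluster number per input row.
clustering_number = dbscan(points, eps=10, min_samples=2)
print(clustering_number)  # [0, 0, 0, 1, 1, 1, -1]
```

If scikit-learn is installed in the selected environment, `sklearn.cluster.DBSCAN` would be the more usual choice; the hand-written version is used here only to keep the sketch self-contained. Because the columns have very different ranges, scaling them before clustering is usually worth considering.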
Continue reading for step-by-step instructions on how to build a Regular script.
STEP 1
Add the node to the table you want to apply a script to, and from the Properties panel, select Regular script.
STEP 2
In the Script window, enter your script. If writing an R script, go to Packages Names and enter the names of any packages your script depends on, separating multiple packages with commas. Pyramid downloads the named package(s) and loads them into the environment. Alternatively, leave this field blank and add the appropriate lines to the script itself.
If using Python to write a Regular script, select the required environment.
STEP 3
Click the plus sign under Input. In From, select the column you want to apply the script to. In To, enter the name that will be used for that column in the script, and click Apply.
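The mapping between From and To can be pictured with a short Python sketch. It assumes, hypothetically, that a Daily Rate column was mapped to the name `daily_rate` and that the values reach the script as a plain list; the sample values and the output name `daily_rate_scaled` are likewise invented for illustration.

```python
# Stand-in for the mapped input (From: Daily Rate, To: daily_rate).
# In a real data flow this variable would be supplied by the node, not defined here.
daily_rate = [100.0, 250.0, 400.0]

# The script refers to the column only through the name entered in To.
low, high = min(daily_rate), max(daily_rate)
span = (high - low) or 1.0  # avoid dividing by zero when all values are equal

# Min-max scale the column to the range 0..1, one result per input row.
daily_rate_scaled = [(value - low) / span for value in daily_rate]
print(daily_rate_scaled)  # [0.0, 0.5, 1.0]
```

Whatever name is entered in To has to match the name used in the script exactly, including case.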
STEP 4
To manually configure the output columns, click the plus sign under Output. Name the new column as written in the script, and enter the output type (float, string, or integer).
Alternatively, click Auto Detect to have Pyramid automatically detect the output columns. See Data Frame Support for more information.
Under Output Options, choose whether to add your column(s) to the existing table, or to create a new table.
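The relationship between output names and output types might look like the following Python sketch. The input name `age`, the three output names, and the list-based representation of columns are assumptions made for illustration; each output would be declared under Output with the same name and the matching type.

```python
# Hypothetical mapped input column.
age = [23, 41, 67]

# Integer output: declare as "age_in_months" with type integer.
age_in_months = [a * 12 for a in age]

# Float output: declare as "age_fraction" with type float.
age_fraction = [a / 100 for a in age]

# String output: declare as "age_band" with type string.
age_band = ["under 40" if a < 40 else "40 and over" for a in age]
```

Each output list holds one value per input row, so the new columns line up with the rows of the table they are added to.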
STEP 5
Press the preview icon to preview your column.
The Select Script function lets you apply an existing script to a table during the ETL. The script is one that was created using the shared scripting feature in Formulate and saved to the CMS.
STEP 1
After adding the Python or R node to the relevant table, click the folder icon to open the Script Files dialog.
STEP 2
Find and select the required script in the Script Files dialog.
STEP 3
Under Input, select the relevant source(s) by clicking the target. Under From, select the required column.
STEP 4
Under Output, select the corresponding output type. Under Output Options, choose whether to add your column to the existing table, or to create a new table.
STEP 5
Press the preview icon to preview your column.